
    On Learning Vector Representations in Hierarchical Label Spaces

    An important problem in multi-label classification is to capture label patterns, or the underlying structures that shape them. This paper addresses one such problem: how to exploit hierarchical structures over labels. We present a novel method for learning vector representations of a label space given a label hierarchy and label co-occurrence patterns. Our experimental results demonstrate qualitatively that the proposed method learns regularities among labels by exploiting the label hierarchy as well as label co-occurrences, and they highlight the importance of hierarchical information for obtaining regularities that facilitate analogical reasoning over a label space. We also experimentally illustrate how the learned representations depend on the label hierarchy.
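    As a rough illustration of the idea, the sketch below trains skip-gram-style label embeddings in which positive pairs are drawn from both label co-occurrences and parent-child edges of the hierarchy. The pairing scheme, negative-sampling loss, and all sizes are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn as nn

num_labels, dim = 100, 32
emb = nn.Embedding(num_labels, dim)
opt = torch.optim.SGD(emb.parameters(), lr=0.1)

def train_pairs(pairs, steps=1000, num_neg=5, batch=64):
    """pairs: list of (label_i, label_j) positives drawn from label
    co-occurrence statistics and from parent-child hierarchy edges alike."""
    pairs = torch.tensor(pairs)
    for _ in range(steps):
        idx = torch.randint(len(pairs), (batch,))
        u = emb(pairs[idx, 0])                          # anchor label vectors
        v = emb(pairs[idx, 1])                          # positive context vectors
        neg = emb(torch.randint(num_labels, (batch, num_neg)))
        pos_score = (u * v).sum(-1)
        neg_score = torch.bmm(neg, u.unsqueeze(-1)).squeeze(-1)
        # negative sampling: pull positive pairs together, push random labels away
        loss = -torch.nn.functional.logsigmoid(pos_score).mean() \
               - torch.nn.functional.logsigmoid(-neg_score).mean()
        opt.zero_grad(); loss.backward(); opt.step()
```

    Once trained, regularities can be probed with vector arithmetic, e.g., by checking whether the nearest neighbor of emb(child_a) - emb(parent_a) + emb(parent_b) is a child of parent_b.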

    Learning Label Structures with Neural Networks for Multi-label Classification

    Multi-label classification (MLC) is the task of predicting a set of labels for a given input instance. A key challenge in MLC is how to capture underlying structures in label spaces. Due to the computational cost of learning from all possible label combinations, it is crucial to take scalability as well as predictive performance into account when dealing with large-scale MLC problems. Another problem that arises when building MLC systems is which evaluation measures to use for performance comparison. Unlike in traditional multi-class classification, several evaluation measures are often used together in MLC because each measure favors a different MLC system. In other words, we need to understand the properties of MLC evaluation measures and build a system that performs well in terms of the measures we are particularly interested in. In this thesis, we develop neural network architectures that efficiently and effectively utilize underlying label structures in large-scale MLC problems.

    In the literature, neural networks (NNs) that learn from pairwise relationships between labels have been used, but they do not scale well to large label spaces. We therefore propose a comparably simple NN architecture that uses a loss function which ignores label dependencies. We demonstrate that simpler NNs using cross-entropy per label work better than more complex NNs, particularly in terms of rank loss, an evaluation measure that takes into account the number of incorrectly ranked label pairs.

    Another commonly considered evaluation measure is subset 0/1 loss. Classifier chains (CCs) have shown state-of-the-art performance in terms of that measure because they explicitly optimize the joint probability of labels. CCs essentially convert learning the joint probability into a sequential prediction problem: the task is to predict a sequence of binary values for the labels. In contrast to the aforementioned NN architecture, which ignores label structures, we study recurrent neural networks (RNNs) so as to make use of sequential structures on label chains. The proposed RNNs are advantageous over CC approaches when dealing with a large number of labels, due to the parameter sharing in RNNs and their ability to learn from long sequences. Our experimental results also confirm their superior performance on very large label spaces.

    In addition to NNs that learn from label sequences, we present two novel NN-based methods that efficiently learn a joint space of instances and labels while exploiting label structures. The proposed joint space learning methods project both instances and labels into a lower-dimensional space in a way that minimizes the distance between an instance and its relevant labels in that space. While the goal of both methods is the same, they use different additional information on label spaces during training: one approach makes use of hierarchical structures of labels and can be useful when such label structures are given by human experts; the other uses latent label spaces learned from textual label descriptions, so that it can be applied to more general MLC problems where no explicit label structures are available. Notwithstanding their differences, both approaches allow us to make predictions for labels that have not been seen during training.
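    To make the joint space idea concrete, here is a minimal sketch of one plausible variant: instances and labels are projected into a shared low-dimensional space, and a margin-based ranking loss pushes each instance closer to a relevant label than to a sampled irrelevant one. The margin loss, optimizer, and dimensions are assumptions rather than the thesis's exact objective.

```python
import torch
import torch.nn as nn

feat_dim, num_labels, dim = 300, 1000, 64
proj_x = nn.Linear(feat_dim, dim)                  # instance projection
label_emb = nn.Embedding(num_labels, dim)          # label positions in the space
params = list(proj_x.parameters()) + list(label_emb.parameters())
opt = torch.optim.Adagrad(params, lr=0.05)

def step(x, pos, neg, margin=1.0):
    """x: (B, feat_dim) instances; pos/neg: (B,) relevant/irrelevant label ids."""
    z = proj_x(x)
    d_pos = (z - label_emb(pos)).pow(2).sum(-1)    # distance to a relevant label
    d_neg = (z - label_emb(neg)).pow(2).sum(-1)    # distance to an irrelevant one
    loss = torch.relu(margin + d_pos - d_neg).mean()
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()
```

    Because labels live in the same space as instances, a label unseen during training can still be ranked at test time once it is given a position, e.g., derived from its hierarchy neighbors or a textual description; this is what enables the zero-shot predictions mentioned above.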

    Large-scale Multi-label Text Classification - Revisiting Neural Networks

    Neural networks have recently been proposed for multi-label classification because they are able to capture and model label dependencies in the output layer. In this work, we investigate the limitations of BP-MLL, a neural network (NN) architecture that aims at minimizing pairwise ranking error. Instead, we propose to use a comparably simple NN approach with recently proposed learning techniques for large-scale multi-label text classification tasks. In particular, we show that BP-MLL's ranking loss minimization can be efficiently and effectively replaced with the commonly used cross-entropy error function, and demonstrate that several advances in neural network training developed in the realm of deep learning can be effectively employed in this setting. Our experimental results show that simple NN models equipped with advanced techniques such as rectified linear units, dropout, and AdaGrad perform as well as or even outperform state-of-the-art approaches on six large-scale textual datasets with diverse characteristics. (Comment: 16 pages, 4 figures, submitted to ECML 2014.)
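    A minimal sketch of the kind of model the paper advocates, assuming PyTorch and illustrative layer sizes: a single-hidden-layer feed-forward network with rectified linear units, dropout, a per-label (binary) cross-entropy loss in place of BP-MLL's pairwise ranking loss, and AdaGrad.

```python
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(10000, 1000),   # e.g., bag-of-words features -> hidden layer
    nn.ReLU(),                # rectified linear units
    nn.Dropout(0.5),          # dropout regularization
    nn.Linear(1000, 500),     # one logit per label (500 labels assumed)
)
criterion = nn.BCEWithLogitsLoss()                  # cross-entropy per label
optimizer = torch.optim.Adagrad(model.parameters(), lr=0.01)

def train_step(x, y):
    """x: (B, 10000) float features; y: (B, 500) binary label indicators."""
    optimizer.zero_grad()
    loss = criterion(model(x), y.float())
    loss.backward()
    optimizer.step()
    return loss.item()
```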

    Sensorless Control of Surface-Mount Permanent-Magnet Synchronous Motors Based on a Nonlinear Observer

    A nonlinear observer for surface-mount permanent-magnet synchronous motors (SPMSMs) was recently proposed by Ortega et al. (LSS, Gif-sur-Yvette Cedex, France, LSS Internal Rep., Jan. 2009). The nonlinear observer generates the position estimate $\hat{\theta}$ via estimates of $\sin\theta$ and $\cos\theta$. In contrast to Luenberger-type observers, it does not require speed information, thus eliminating the complexity associated with speed estimation errors; it is also simple to implement. In this study, the performance of the nonlinear observer is verified experimentally. To obtain speed estimates from the position information, a proportional-integral (PI) tracking controller is used as a speed estimator. The observer performs well both with and without load at speeds above 10 r/min.
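    The PI tracking controller can be sketched as a phase-locked-loop-style update that drives a tracked angle toward the observer's position estimate; the PI output is then the speed estimate. The gains and sample period below are illustrative assumptions.

```python
import math

Kp, Ki, Ts = 200.0, 5000.0, 1e-4        # PI gains and sample period (assumed)
theta_track, integ = 0.0, 0.0

def pi_speed_estimate(theta_hat):
    """One update; returns (speed estimate [rad/s], tracked angle [rad])."""
    global theta_track, integ
    # wrap the tracking error to (-pi, pi] so the loop follows angle rollovers
    err = math.atan2(math.sin(theta_hat - theta_track),
                     math.cos(theta_hat - theta_track))
    integ += Ki * err * Ts                # integral action
    omega = Kp * err + integ              # PI output = speed estimate
    theta_track += omega * Ts             # integrate speed to track the position
    return omega, theta_track
```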

    Deciphering the communicative code in speech and gesture dialogues by autoencoding hypernetworks

    Nam J, Bergmann K, Waltinger U, Kopp S, Wachsmuth I, Zhang B-T. Deciphering the communicative code in speech and gesture dialogues by autoencoding hypernetworks. Presented at ESLP 2011: Embodied and Situated Language Processing, Center for Interdisciplinary Research (ZiF), Bielefeld University, Germany.

    Long-term survival benefits of intrathecal autologous bone marrow-derived mesenchymal stem cells (Neuronata-R®: lenzumestrocel) treatment in ALS: Propensity-score-matched control, surveillance study

    Objective: Neuronata-R® (lenzumestrocel) is an autologous bone marrow-derived mesenchymal stem cell (BM-MSC) product, which was conditionally approved by the Korean Ministry of Food and Drug Safety (KMFDS, Republic of Korea) in 2013 for the treatment of amyotrophic lateral sclerosis (ALS). In the present study, we aimed to investigate the long-term survival benefits of treatment with intrathecal lenzumestrocel.
    Methods: A total of 157 participants who received lenzumestrocel and whose symptom duration was less than 2 years were included in the analysis (BM-MSC group). The survival data of placebo participants from the Pooled-Resource Open-Access ALS Clinical Trials (PROACT) database were used as an external control, and propensity score matching (PSM) was used to reduce confounding biases in baseline characteristics. Adverse events were recorded during the entire follow-up period after the first treatment.
    Results: Survival probability was significantly higher in the BM-MSC group than in the external control group from the PROACT database (log-rank, p < 0.001). Multivariate Cox proportional hazards analysis showed a significantly lower hazard ratio for death in the BM-MSC group and indicated that multiple injections were more effective. Additionally, no serious adverse drug reactions were found during the safety assessment period, which lasted a year after the first administration.
    Conclusion: Lenzumestrocel treatment had a long-term survival benefit in real-world ALS patients.
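    The propensity-score-matching step can be sketched as follows: fit a logistic model of treatment assignment on baseline covariates, then greedily match each treated participant to the nearest unused control by estimated score. The covariates and the greedy 1:1 nearest-neighbor variant are assumptions, not the study's exact protocol.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def match_controls(X_treated, X_control):
    """X_*: (n, k) baseline covariate matrices; assumes at least as many
    controls as treated. Returns the matched control index per treated row."""
    X = np.vstack([X_treated, X_control])
    t = np.r_[np.ones(len(X_treated)), np.zeros(len(X_control))]
    # propensity score: P(treated | covariates) from a logistic model
    ps = LogisticRegression(max_iter=1000).fit(X, t).predict_proba(X)[:, 1]
    ps_t, ps_c = ps[:len(X_treated)], ps[len(X_treated):]
    matched = np.empty(len(X_treated), dtype=int)
    used = set()
    for i in np.argsort(ps_t):                       # greedy 1:1 matching
        order = np.argsort(np.abs(ps_c - ps_t[i]))   # nearest controls first
        j = next(int(j) for j in order if int(j) not in used)
        used.add(j)
        matched[i] = j
    return matched
```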

    Semi-Supervised Neural Networks for Nested Named Entity Recognition

    In this paper, we investigate a semi-supervised learning approach based on neural networks for nested named entity recognition on the GermEval 2014 dataset. The dataset consists of triples of a word, a first-level named entity associated with that word, and a second-level one. Additionally, the tag distribution is highly skewed; that is, certain types of tags occur only rarely. Hence, we present a unified neural network architecture that deals with named entities on both levels simultaneously and improves generalization performance on classes with a small number of labelled examples.
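    One plausible reading of such a unified architecture is a shared encoder with one classification head per nesting level, trained with a joint loss; the BiLSTM encoder and tag-set sizes below are assumptions about the setup, not the paper's exact model.

```python
import torch
import torch.nn as nn

class TwoLevelTagger(nn.Module):
    def __init__(self, vocab=20000, dim=100, hidden=128, n_tags1=25, n_tags2=25):
        super().__init__()
        self.emb = nn.Embedding(vocab, dim)
        self.enc = nn.LSTM(dim, hidden, bidirectional=True, batch_first=True)
        self.head1 = nn.Linear(2 * hidden, n_tags1)  # first-level tags
        self.head2 = nn.Linear(2 * hidden, n_tags2)  # second-level tags

    def forward(self, tokens):
        h, _ = self.enc(self.emb(tokens))            # shared representation
        return self.head1(h), self.head2(h)

model = TwoLevelTagger()
loss_fn = nn.CrossEntropyLoss()  # per-class weights could counter the skew

def step_loss(tokens, tags1, tags2):
    """tokens: (B, T) ids; tags1/tags2: (B, T) gold tags for each level."""
    l1, l2 = model(tokens)
    return loss_fn(l1.flatten(0, 1), tags1.flatten()) + \
           loss_fn(l2.flatten(0, 1), tags2.flatten())
```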
